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§1  Introduction 


In  this  paper  we  present  results  on  the  problem  of  estimating  variable 
(in  space  and  time)  coefficients  in  general  parabolic  models.  Our  efforts 
were  motivated  by  consideration  of  fundamental  transport  equations  (based  on 
mass  balance)  such  as 

n-H  $  ♦  &  W  -  &  £>  ♦ 

0  <  x  <  1,  t  >  0,  that  often  arise  in  population  dispersal  models  [12],  [19], 
[22],  [23]  as  well  as  in  more  classical  applications  involving  material  trans¬ 
port.  In  the  case  of  species  dispersa1  models,  the  term  involving  l /  represents 
a  directed  movement  mechanism  (convective,  advective,  attractive,  chemotatic, 
etc.  phenomena),  V  is  the  coefficient  of  diffusion  under  the  usual  Fick's  first 
law  formulation,  and  g  represents  general  sink/source  mechanisms  (death/birth, 
emigration/immigration,  etc.)  while  u  is  the  population  density.  Recently 
such  equations  (with  spatially  dependent  V  and  constant  V)  have  been  successfully 
employed  in  studies  of  fiea  beetle  movement  [12]  and  transport  of  labeled 
substances  in  brain  tissue  [12],  [28].  However,  as  pointed  out  in  [12],  some 
of  those  investigations  were  limited  somewhat  by  an  assumption  that  V  and  V 
be  constant  in  time.  In  the  case  of  insect  dispersal,  the  fact  that  insects 
are  ectotherms  and  are  very  sensitive  to  weather  leads  to  a  natural  expectation 
that  their  movement  rates  will  vary  temporally  when  observations  are  taken 
over  several  days  or  more.  To  model  behavior  in  extended  field  situations,  one 
must  therefore  employ  equations  with  time  varying  coefficients  (Example  5.4 
in  Table  3  of  [12]  illustrates  how  dramatically  one  might  expect  the  estimated 
diffusion  coefficient  to  vary  when  determined  from  experimental  data  for  1  day  vs 
that  obtained  with  data  for  3  days.)  As  explained  in  [12],  [23]  there  are  also 
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situations  in  which  one  should  allow  for  spatial  dependence  in  the  coefficients. 

We  present  below  techniques  for  estimating  nonconstant  (i.e.,  functional)  coef¬ 
ficients  in  a  general  class  of  parabolic  equations  which  includes  equations  such  as 
(1.1)  where  l/=l/(t,x),  P=P(t,x),  and  g(t,x,u)  =  k(t,x)u.  We  are  the  first,  to 
our  knowledge  (e.g.,  see  the  surveys  [15],  [24]),  to  give  a  complete  treatment  con¬ 
sisting  of  algorithms  with  convergence  proofs  as  well  as  numerical  results  for  these 
variable  coefficient  estimation  problems  (some  preliminary  findings  in  this  area 
were  announced  in  our  earlier  note  [8]).  The  methods  we  develop  are  based  on 
cubic  spline  function  approximations  and  can  be  correctly  viewed  as  extensions 
of  those  developed  in  [7],  [12].  However,  the  fact  that  we  wish  to  treat  non- 
autonomous  systems  means  that  the  semigroup  approach  used  in  [7],  [12]  to 
obtain  theoretical  convergence  results  for  problems  involving  autonomous 
parabolic  equations  is  not  directly  applicable.  (An  analogue  of  the  Trotter- 
Kato  convergence  theorem  --  see  [7]  —  which  can  be  used  to  treat  evolution 
equations  :an  be  found  in  the  factor  space  methods  developed  in  [20,  Chap.  V]; 
however,  the  use  of  an  approximating  evolution  operator  framework  for  the 
problems  under  consideration  here  would  not  offer  any  simplification  in 
conceptual  or  technical  details.)  Fortunately,  an  alternate  approach  that  we 
have  employed  with  success  [1],  [9],  [16],  [17]  in  treating  estimation  and 
control  problems  for  delay  systems  may  be  applied  here,  avoiding  the  semigroup 
evolution  operator  theories  altogether.  Our  conceptual  framework  (in  which 
the  "consistency  plus  stability  implies  convergence"  steps  of  the  Lax 
Equivalence,  Trotter-Kato,  factor  space  theories  can  also  be  clearly  identified) 
relies  on  properties  enjoyed  by  certain  approximation  schemes  when  applied 
to  dissipative  operators,  a  simple  application  of  Gronwall's  inequality,  and 
basic  spline  approximation  estimates.  While  we  consider  only  linear  systems 
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in  this  presentation,  our  ideas  and  methods  can  be  extended  to  also  treat 
estimation  and  control  problems  involving  certain  nonlinear  systems  of  practical 
interest  (for  example,  see  [1],  [7],  [8],  [9],  [13],  [17]). 

In  the  presentation  below  we  consider  systems  of  the  form 
2 

H  =  q^t.x)  ~  +  q2(t,x)  +  q3(t,x)u  +  f (t,x,q) 

3  X 

of  which  the  linear  system  (1.1)  (with  g  =  ku)  can  be  shown  to  be  a  special 
case  (i.e.,  carrying  out  the  differentiations  in  (1.1),  we  have  q-j  =  V, 
q2  =  Px  •  and  q3  =  ~  +  k  so  that  knowledge  of  q^,  q2,  q3  yields  P,  l/,  k). 

We  first  show  in  section  2  that  such  systems  can  be  equivalently  viewed  in 
an  abstract  framework;  approximate  estimation  problems  are  formulated  and 
convergence  results  given  in  section  3.  In  the  final  section  we  discuss 
implementation  of  the  proposed  methods  and  present  a  summary  of  our  numerical 
findings  for  a  number  of  test  examples. 
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§2.  Formulation  of  the  parameter  estimation  problems 

We  turn  then  to  consideration  of  the  problem  of  estimating  the  vector  of 
unknown  variable  coefficients,  q(t,x) = (q^ (t,x),  ....  qn(t,x)),  appearing  in 


the  parabolic  equation 

3U  _ 


32U 


(2.1)  ~  =  q-j (t,x)  ~  +  q2(t,x)  +  q3(t,x)u  +  f(t,x,q(t,x)) , 

3x 

on  U  =  (0,T)  x  (0,1),  with  Dirichlet  boundary  conditions 

(2.2)  u(t,0)  =  u (t , 1 )  =  0 
and  initial  condition 

(2.3)  u(0,x)  =  <}i(x). 

We  assume  throughout  that  the  parameter  q  =  (q-j,...,qn)  belongs  to  Q,  where  Q 
is  a  given  compact  (in  the  l_2  topology)  subset  of 

£(  m,M)  =  ^(q-,,...,qn)  €  L2(U)x. . .  x|_2(U)  ( 0  <  m  £  q^t.x)  £  M, 

I  3ql 


3x 


(t,x) 


<  M,  |  q  j  (t,x)  |  £  M,  j  =  2,3;  for  i  =  l,2,...,n. 


32q, 


3qi  - 

-rr—  and  — s-  are  uniformly  Holder  continuous  on  U  with 
3t  3X*" 


exponents  in  (0,1) 


for  some  positive  constants  m,M.  We  further  assume  that  we  are  given  observations 

y.j  €  L2(0,1)  for  the  system  (2.1)-(2.3)  at  discrete  times  t^ ,  i  =  l,...,r 

(t,  >  0,  t  <  T),  and  an  output  map  Y(t,x,q)  :  R  -►  R  such  that  Y  is  continuous 
»  ' 

in  q  and  such  that  the  map  x  Y(  t  -  ,x,q)v(x)  is  in  L2(0,1)  whenever  v£  L2  (0,1 ) . 
The  fundamental  identification  problem  we  consider  in  this  paper  then  consists 
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4 

of  determining  a  q  £  Q  that  minimizes  a  distributed  least  squares  fit-to-data 
^  functional  given  by 

|  (2.4)  J(q)  =  l  /  |y.(x)  -  Y(t. ,x,q(t. ,x))u(t. ,x;q) |2dx  , 

1=1  0  1  1  1  1 

r 

where  u  is  the  solution  to  (2.1)-(2.3)  corresponding  to  q  £Q. 

We  make  the  following  standing  assumptions  regarding  Q  and  the  system 

(2.1)- (2. 3): 

For  each  q  £Q  the  perturbation  function  g(t,x)  =  f (t,x,q(t,x) )  is 
2 

such  that  and  are  uniformly  Holder  continuous  on  U  with 
exponents  in  (0,1). 

For  each  q  €  Q,  f (0,0,q(0,0)) »  f (0,1 ,q(0,l ) )  =  0. 

The  function  q  -*  f(*,*,q)  is  continuous  as  a  mapping  from 
^(Ujx...  x  L2(u)  to  [^(U). 

The  initial  function  $  is  in  Hg(0,l). 

Standard  results  from  the  theory  of  parabolic  partial  differential  equations 
may  be  called  upon  to  guarantee  that,  under  hypotheses  (H1)-(H4),  there  exists 
a  unique  (classical)  solution  u  (on  [0,T]  x  [0,1])  to  (2.1)  to  the  initial¬ 
boundary  value  problem  (2.1)-(2.3)  (see,  for  example,  Theorem  7,  p.  65  of 

2 

[18]).  In  fact,  using  such  results,  we  find  that  -Mj-  and  ~  are  uniformly 

3x  ^ 

continuous  functions  on  U. 

We  note  that  our  use  of  homogeneous  boundary  conditions  (2.2)  with 
equation  (2.1)  does  not  restrict  the  generality  of  our  results  since  the 
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corresponding  nonhomogeneous  boundary  value  problem, 


(2.5) 


~  =  q-j (t,x)  +  q2(t,x)  +  q3(t,x)u  +  f (t,x,q(t,x) ) 

3  X 

u(t,0)  =  uQ(t) 
u(t,l)  =  u i ( t ) 
u(0,x)  =  <j>(x) , 


can  be  transformed  into 


(2.6) 


2 

fj  =  qi(t,x)  +  q2(t,x)  +  q3(t,x)v  +  F(t,x,q(t,x) ) 

3  X 


v(t,0)  =  v ( t , 1 )  =  0 


v(0,x)  =  <f>(x) 


by  letting  v(t,x)  =  u(t,x)  -  (1-x)uQ(t)  -  xu^t)  .  In  the  event  that  f  satisfies 

(H1)-(H3)  and,  for  i  =  0,1, u.  is  a  uniformly  Holder  continuous  function  on  (0,T) 

■ 

satisfying  u^(0)  =  u ^ ( 0 )  =  0,  it  is  easy  to  see  that  the  new  perturbing  function 


given  by 


F(t,x,q(t,x))  =  f ( t, x ,q( t,x) )  -  qg ( t ,x) ( uQ( t )  -  u^t)) 

+  q3(t,x)((l-x)u0(t)  +  xu^t))  -  (l-x)uQ(t)  -  xu i ( t ) 

also  satisfies  hypotheses  (HI )-(H3) .  A  similar  statement  may  be  made  about 
the  case  when  u(t,0)  =  q^Ug(t)  and  u(t,l)  =  q^u^(t),  where  q4  and  q^  are 
(constant)  components  of  the  unknown  parameter  vector  q. 

If  f  does  not  depend  on  q  it  is  possible  to  relax  the  differentiability 
requirements  for  f  in  the  spatial  variables  (as  defined  in  (HI))  although 
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modest  changes  must  be  made  in  the  theoretical  arguments  of  section  3.  For 
details  on  the  alternative  hypotheses  and  an  indication  of  the  required 
modifications,  see  Remark  3.1  in  that  section. 

Having  stated  the  fundamental  estimation  problem,  we  next  consider  an 

abstract  setting  that  might  facilitate  our  treatment  of  the  problem.  We  shall 

-N 

approximate  solutions  q  of  this  problem  by  a  sequence  {q  }  of  solutions  to 
estimation  problems  that  are  computationally  more  tractable  than  our  original 
problem.  The  approach  taken  here  is  similar  in  spirit  to  that  of  a  number  of 
other  related  efforts  (see  [1],  [4],  [5],  [6],  [7],  [9], [13]  and  [16])  in 
that  the  original  estimation  problem  is  reformulated  in  a  Hilbert  space  setting 
where  Ritz-Galerkin  type  ideas  may  be  applied  to  construct  the  approximate 
problems.  To  this  end,  we  rewrite  (2.1  )-(2.3)  as  an  abstract  evolution  equation 
(AEE)  in  an  infinite-dimensional  state  space;  although  the  use  of  spaces 
and  operators  here  is  quite  standard  and  well-established  in  the  literature, 
the  dependence  of  our  problem  on  unknown  parameters  requires  that  we  make  an 
effort  to  carefully  define  the  operators  involved. 

Define  2  =  l_2(0,l)  with  the  usual  inner  product  <•»•>  and  norm  |  •  | .  For 
q  =  ( q i , . . .  ,qn)  G  Q,  and  t€  [0,T],  let  the  operator  A(t,q)  :  V  ■*  Z  be  given  by 

(2.7)  A(t,q)*  =  q-j  ( t ,  •  )D^<|»  +  q2(  t ,  *  )D^  +  q3(t,*)'l» 

for  4/  e  dom(A(t,q) )  =  V  =  H^ (0,1)  O  h! (0,1),  where  D  =  —  is  the  usual  dif- 
ferentiation  operator.  Finally,  for  each  q  £  Q  and  t  e  [0,T],  define 
G(t,q)  =  f ( t , • ,q( t , • ) ) £  Z.  Before  we  give  the  equivalence  between  (2.1)-(2.3) 
and  an  AEE  on  Z,  we  shall  establish,  for  the  operator  A(t,q),  a  dissipative 
inequality  that  is  fundamental  to  the  calculations  that  follow. 


Lemma  2.1.  There  is  a  constant  u>  >  0  such  that 


(2.8)  <A(t  <_  oj|  |  ^ 

for  all  i p  €  V,  uniformly  in  t  e  [0,T]  and  q  £  Q. 

Proof:  For^efl,  qe  Q  and  t  £  [0,T], 

2 

<A(t,q)ij>,ijj>  =  <q-j D‘  +  <q2Dif; ,4/>  +  <q^^ ,i^>  , 

where  q.  is  understood  to  mean  q.(t,-)  for  i  =  1,2,3,  Using  integration  by 
parts  and  the  boundary  conditions  on  \p  we  obtain 

2 

<q]D  \|> , tJ>>  =  -  <D^,D(q-|ip)> 


< 


l4  ( Dq i  )^ | 


2 


In  addition, 


<q2D\j*,^>  =  <q12D^,  -f  <fr> 


3q1 

Combining  these  estimates  with  bounds  on  qi ,  i  =  1,2,3,  and  given  in  the 
definition  of  Q,  we  find 

<A(t,q)i|i,i|»>  \\  (Dq,)ip|2  +  %  |-|  ^|2  +  M|<|;|2 

qf  qf 

M2 

so  that  uj  =  —  +  M  can  be  chosen  independent  of  t  and  q£  Q  . 
m 

In  the  theorem  that  follows  we  establish  the  equivalence  between  (2. 1 )- (2. 3) 
and  an  AEE  on  Z,  obtaining  some  needed  continuous  dependence  results  as  well. 

Theorem  2.1  Let  q  €  Q  be  fixed  and  let  y(t,q)  =  u(t,*;q)  where  (t,x)  ■*  u(t,x;q) 
is  the  solution  to  (2.1 )-(2. 3)  on  U.  Then  y  is  also  the  unique  solution  on 
[0,T]  of 

t 

(2.9)  z(t)  =  0  +  /  {A(o,q)z(o)  +  G(a,q)}do. 

0 

Furthermore  y(t;q)€  Z  is  continuous  in  t  €  [0,T]  and  uniformly  continuous  in 
2 

4>  e  Hq  (in  the  Z  topology), uniformly  in  te  [0,T]  and  q  €  Q. 


Proof:  That  t  -*■  y(t;q)  satisfies  (2.9)  follows  immediately  from  the  definitions 

of  the  operators  involved  and  the  comments  following  (HI ) - ( H4 ) .  The  arguments 
required  to  establish  uniqueness  are  similar  to  those  for  continuous  dependence, 
so  we  present  only  the  estimates  that  prove  the  latter. 

'Xi  3 

If  z  and  z  denote  solutions  to  (2.9)  associated  with  0  and  0  (in  Hq) 


respectively,  then 


[I] 


(2.10)  z(t)  -  z(t)  ■♦-♦+/  {A(a,q)z(a)  -  A(o,q)z(c) }dc  . 

0 

We  require  an  integral  version  of  the  well-known  result  [14,  p.  100]  that 

^  | x( t)  1 2  =  <x(t),x(t)>.  If  X  is  a  Hilbert  space  and  if  x:  [a,b]  -*■  X  is 

given  by  x(t)  =  x(a)  +  /  v(a)do,  then 

0 

(2.11)  |x(t)|2  =  t xCa)  [ ^  +  2  /  <  v(o)  ,x(cr)>da  . 

0 

In  our  case  we  find,  using  (2.10)  and  this  result,  that 

|z(t)  -  *(t)|2  =  |«  “  £|2  +  2  /t<A(a,q)(z(o)  -  z(a)),z(a)  -  2(a) > da 


1  U  “  $|2  +  2u)  /  |z(a)  -  z(c)|2do  , 
0 


where  we  have  used  the  dissipative  property  for  A(a,q)  that  is  uniform  in  a  and 
q.  Finally,  an  application  of  Gronwall's  inequality  yields 

|z(t)  -  z( t)  | 2  <  U  -  $|2  exp(2a)T)  , 

and  the  desired  continuity  result  (uniform  in  t  and  q)  obtains. 


In  view  of  the  established  equivalence,  the  parameter  estimation  problem 
involving  (2.4)  may  be  reformulated  as  an  abstract  identification  problem, 
where  we  now  wish  to  find  q  e  Q  that  minimizes 


(2.12)  J(q)  =  y  y.  -  C(t . ,q)z(t .  ;q) 


where  z  is  the  solution  to  (2.9)  corresponding  to  q£  Q,  C(t,q)  :  Z  -*•  Z  is 
defined  by  C(t,qh  =  Y(t ,  -  ,q(t ,  •  ))<|>,  for  *€  Z,  and  |*|  is  the  usual  L2(0,1) 
norm. 

Theorem  2.2.  There  is  a  solution  q  in  Q  to  the  parameter  estimation  problem 
for  (2.9),  (2.12)  (equivalently,  (2.1)-(2.3),  (2.4)). 

Proof:  In  the  arguments  that  follow,  we  shall  show  that  the  map  q  ■+  z(t;q)  is 

continuous  for  each  t  e  (0,T),  so  that  the  continuity  of  q  -*•  J(q)  is  assured; 
therefore,  there  is  a  parameter  q  in  the  compact  set  Q  for  which  J  attains 
its  minimum. 

Let  q,  q  be  given  in  Q  and  let  z,  z  be  the  corresponding  solutions  to 
(2.9).  Applying  (2.11)  we  find  that 

1  z ( "t )  -  z(t)|2=  2  /<(A(o,q)  -  A(a,q))z(a)  ,z(o)  -  z(a)>da 
0 

+  2  /  <A(o,q)(z(a)  -  z(a)),z(a)  -  z(a)>da 
0 

t 

+2  /  <G(a,q)  -  G(a,q),z(a)  -  z(a)>da 
0 

<  2  n(A(a,q)  -  A(a,q))z(a)|  |z (a)  -  z(o)|do 
0 

t  p 

+  2 a)  J  |  z(u)  -  z(o)  |  da 
0 

+  2  /  |G(o,q)  -  G(a,q)|jz(a)  -  z(a)|da 
0 

<.  /  |(A(a,q)  -  A(a,q))z(a)|2do 
0 

T  7 

+  /  |G(a,q)  -  G(a,q) [  da 
0 

T  7 

+  2(o)+l )  /  |  z(a)  -  Z(a)rda, 

0 
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i 

[i 

t 


where  dissipativeness  for  A(o,q)  has  been  used,  along  with  the  inequality 
2ab  _<  a  +  b  that  holds  for  a  and  b  real  numbers.  Applying  Gronwall's 
inequality  to  the  last  expression  we  see  that  the  continuity  result  depends  on 
estimates  for  the  first  term  (that  the  second  term  involving  G  may  be  made 
arbitrarily  small  is  immediate  from  (H3 ) ) .  To  this  end,  we  observe  that 

(2.13)  /Tj(A(a,q)  -  A(a  ,q)  )z(o )  1 2do  <  2  /  [(q^a,*)  -  ^  (a  ,• )  )D^z(a)  1 Zda 
0  0 

T  o 

+  4  /  |  (q?(a ,  • )  -  q?(o,.))Dz(a)rdo 
0  c 

+  4  /  |(q1(o,»)  -  q-,(a,* )  )z(o)  j2do 
0  J  J 

<  4(w2  +  w2  +  w2)  I  q-q  I  2 


where  in  the  last  expression  |*|  denotes  the  l_2(U)  x  ...  x  L^dJ)  norm  and 
N.  =  sup  { |  (D1z(o))(x)|  :  o  e  (0,T),X€  (0,1)}  is  finite,  i  =  0,1,2,  (see  the 
comments  following  (HI )- (H4)  above).  The  continuity  of  J  and  thus  the  existence 
results  are  established. 


We  remark  that  the  important  aspect  cf  the  above  result  is  the  continuous 
dependence  of  z  on  q  which  is  needed  to  establish  existence  of  solutions  to  the 
approximate  estimation  problems  formul ated  in  the  next  section  (see  Theorem  3.2) 
Indeed,  existence  of  solutions  to  our  original  estimation  problem  follows  from 
the  convergence  results  for  sequences  of  solutions  to  the  approximate  problems 
(see  Theorem  3.5). 


I 


§3  Approximate  parameter  estimation  problems 

We  focus  in  this  section  on  the  problem  of  approximating  the  infinite¬ 
dimensional  estimation  problem  for  (2.9),  (2.12)  by  a  sequence  of  parameter 
estimation  problems  for  which  the  state  variable  satisfies  an  ordinary  differential 
equation  (ODE)  on  a  finite-dimensional  state  space  Z  .  Fundamental  to  this 

undertaking  is  the  task  of  establishing  the  convergence  of  solutions  of  the 

N 

approximating  systems  on  Z  to  solutions  of  the  original  AEE  on  Z.  Although  our 
formulation  is  a  classical  one  of  the  Ritz  type  (involving  orthogonal  projections 
of  an  infinite-dimensional  system  onto  a  sequence  of  finite-dimensional  sub¬ 
spaces)  that  is  modified  to  allow  for  q-dependent  operators  and  variables,  our 

calculations  are  much  simpler  than  those  for  functional  differential  equations 

N 

in  [4],  [9]  and  [16],  where  the  spaces  Z  and  Z  as  well  depend  on  the  choice 


For  the  approximation  spaces  Z  ,  we  take  the  spans  of  cubic  spline  basis 

elements  which  have  been  modified  so  as  to  satisfy  the  homogeneous  boundary 

N 

conditions.  Specifically,  for  any  integer  N  >  0,  let  x^.  =  j/N,  j  =  -3,...,N+3, 

N  n  N 

and  Sj,  j  =  -1 , . . .  ,N+1 ,  be  the  cubic  spline  that  vanishes  outside  (*j_2>  xj+2'* 

N  M 

has  value  4  and  slope  0  at  x.,  value  1  and  slope  3N  at  x'._ ■> ,  and  value  1  and 

3  J 

N  N 

slope  -3N  at  Xj+-j  .  Then  is  the  standard  cubic  spline  found  in  the  literature 
(see,  for  example,  [27,  p.  73]  where  the  basis  elements  differ  from  the  ones 
defined  here  by  a  factor  of  24).  We  note  that  for  some  values  of  j  we  do  not 


have  S'-  satisfying  homogeneous  boundary  conditions.  The  modified  basis, 

J 

N 

elements  B-  considered  here  will  be  the  restriction  to  [0,1]  of  the  following 
functions: 
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rN  _  ,N  .  N 
B0  '  S0  '  4S-1 


2^  =  Sq  -  4S^ 

=  SN,  j  =  2,...,N-2 
J  J 

BN  -  SN  -  4SN 

Vl  N  ^N-l 
bn  _  j\  4Sn 

BN  "  SN  4SN+1  ' 

N  N  N  N 

The  approximating  subspaces  Z  of  Z  are  then  given  by  Z  =  span{BQ, . . . ,BN> , 

N 

for  which  it  is  easy  to  see  that  Z  C  V  for  each  N  =  1,2,...  . 

Our  intermediate  goal  is  to  find  approximations,  for  fixed  q  &  Q,  to  the 
solution  z  on  [t^T]  of 

t 

(3.1)  z(t;q)  =  z(t,;q)  +  /  {A(a,q)z(a;q)  +  G(a,q)}da, 

*1 

which  is  equivalent  to  (2.9)  if  z(t^;q)  is  given  by  (2.9)  for  time  t^  >  0  (the 
first  sampling  time).  To  this  end  we  define,  for  t  C  [t^,T] 

(3.2)  ZN(t;q)  =  P^t^q)  +  J  {AN(o,q)zN(o;q)  +  PNG(a,q)}da, 

*1 

where  PN  :  Z  -+  ZM  is  the  orthogonal  projection  characterized  by  <PNz  -  z,  B^>  =  0, 
j  =  0,1 ....  ,N,  and  Afl  :  Z  -+  ZN  is  defined  by  AN(t,q)  =  PNA(t,q)PN.  Since 
zN(t)  in  (3.2)  belongs  to  ZN,  a  finite-dimensional  space,  equation  (3.2)  is 


equivalent  to  the  ODE  on  [t^,T] 


(3.3) 


z  (t;q)  =  AN(t,q)zN(t;q)  +  PNG(t,q)  , 
zN(t1;q)  =  PNz(t1;q) 


1 
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which,  as  we  shall  show  in  the  arguments  to  follow,  approximates  (3.1)  in  some 
sense.  We  remark  here  that  although  <t>  does  not  explicitly  appear  in  (3.2) 
and  (3.3),  zN  depends  on  4  in  that  zN(t^)  was  generated  using  z(t-j)  from  (2.9) 
with  z(0)  =  <j>. 

Associated  with  (3.2)  (or  (3.3))  is  an  "approximate  parameter  estimation 
-N 

problem":  Find  q  £  Q  that  minimizes 

(3.4)  JN(q)  =  l  |y .  -  C ( t - ; q ) zN ( t . ; q ) | 2 
i=l  1  1  1 

N 

where  z  is  the  solution  to  (3.3)  corresponding  to  q  £  Q. 

N 

Before  discussing  the  convergence  of  z  (t)  to  z(t)  as  N  +  »,  we  establish 

analogues  of  Lemma  2.1  and  Theorem  2.1  for  the  operator  and  the  ODE 

(3.3),  respectively.  Our  first  result  is  a  dissipative  property  for  AN  that 

N  N 

follows  immediately  from  the  same  property  for  A;  that  is,  for  any  ip  £  Z  , 

A*  „\,N  ,N  DN « / .  ,J,N  N 
<A  (t,qH  ,\|»  >  =  <P  A(t,q)P  ip  ,ip  > 

-  a/*  dM 

=  <A(t,q)P  ip  ,  P  ip  > 


<  w  |  ip 


N.2 


N 

where  we  have  used  the  fact  that  P  is  self-adjoint.  We  summarize  this  finding 
in  the  following. 

2 

M  N  M 

Lemma  3.1.  Let  u>  =  —  +  M  >  0.  Then,  for  every  \p  £  Z  , 


/  o  A.  „  \  ,N  ,N  ,  N,  2 

(3.5)  <A  (t,q)ij;,ij;  >  <_  u  iJj  i 


for  all  t  £  [0 , T]  and  q  £  Q,  and  for  each  N  =  1,2,...  . 
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t  . 


1 

L 


Our  next  result  guarantees  the  existence  of  solutions  to  (3.3)  as  well  as 
to  the  Nth  parameter  estimation  problem.  Furthermore,  in  the  proof  of  the 
former  we  outline  the  numerical  scheme  used  to  solve  (3.3)  for  various  sample 
problems  reported  in  section  4. 

N 

Theorem  3.1.  Let  q  be  fixed  in  Q.  Then  there  exists  a  unique  solution  z  to 

3 

(3.3)  on  [t^,T]  that  depends  continuously  on  <p  €  Hq  (in  the  L2(0,1)  topology), 
uniformly  in  t  €  [t^,T],  q  e  Q,  and  N  =  1,2,...  . 

N 

Proof:  Our  arguments  are  similar  to  those  developed  in  [9],  [11],  where  z  (t) 

M  M 

is  written  in  terms  of  the  basis  elements  {B . >  for  Z  . 

J 

M 

Define  8  as  the  1  *  (N+l)  row  vector  function  given  by 


(Bg,...,B[J). 


For  each  t  e[t^,T],  zN(t)  £  ZN,  so  that  there  exists  wN(t)e  RN+1,  wN(t)  = 
coINqU),.  ..  ,w|[!j(t))  such  that  zN(t)  =  BNwN(t).  Thus  we  may  rewrite  (3.3)  as 


(3.6) 


'  N.N,.,  .N/  N  N,  +  ,  .  0N_N/  +  » 

6  w  (t)  =  A  (t,q)6  w  (t)  +  8  9  (t). 


.N  N/.  x  N  N 
8  w  (tj)  =  8  ? 


te  [trT], 


where  ^  and  gN(t)  in  RN+^  are  defined  by  P^z(t^)  =  8N^  and  PNG(t)  =  8%^(t); 

N  N  N  N 

it  is  understood  that  w  ,  g  ,  and  x,  depend  on  q  6  Q.  Let  A  (t,q)  denote  the 

N 

matrix  representation  of  A  (t,q)  with  respect  to  the  basis  elements 
N  N 

{Bq,...,B^}.  Then  usual  Galerkin  arguments  establish  that  the  coefficients 
w^(t)  in  (3.6)  satisfy 


(3.7) 


fiV)  =  AN(t,q)wN(t)  +  gN(t) , 


t  e  [t1 , t]  , 


4 
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so  that  once  we  determine  A  (t,q)  we  will  be  able  to  analyze  the  existence 
and  uniqueness  of  solutions  to  (3.7);  in  addition,  the  realization  of  numerical 
algorithms  to  solve  our  approximate  estimation  problem  will  be  based  on  (3.7). 
Note  that  for  any  ip  £  Z,  (p\  -  ip)  €  (Z^)^-  so  that  for  each  j  =  0,...,N, 


-  \p,  B^>  =  0  , 

J 

or,  equivalently, 

(3.8)  <ip,  B*!>  =  <PNt/i,  BN>  . 

J  J 

N  N  fi  T  N+l 

However,  since  there  exists  e,  =  (Cq . C^)  €  R  such  that 

(3.9)  P%  =  bV  , 


equation  (3.8)  can  be  written 


DN  „NrN  dN 
<4>,  Bj>  =  <6  C  ,  B^> 


V  dn  bn  n  .  n 

l  <B • ,  B->  £.,  J  =  o,.. 

i=0  J 


,N. 


This  may  be  written  as  a  matrix  equation 


HN(*)  =  qV 


where 


H%)  * 


,  dn 

<*,  Bq> 
«P,  Bn> 


and  the  (N+l)  square  matrix  Q  has  elements  Q.  ■ 

'  »J 


U  N 

<B:,  Bj> ,  i , j  =  0,1 ,, 


•  >N. 
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It  follows  that  we  may  solve  for  £  by  writing 

(3.10)  £N  =  (qVW)  , 

N 

where  Q  is  invertible  since  it  is  the  Grammian  of  linearly  independent  basis 
elements  for  Z  ,  The  desired  representation  for  A  (t,q)  may  be  derived  from 
these  estimates  since 

AN(t,q)zN(t)  =  PNA(t,q)8NwN(t) 

•  PN[q1(t,-)D2(6NwN(t))  +  q2(t,.)D(eV(t))+q3(t,-)eNwN(t)] 

N  N/+\ 

=  6  a  (t) 

for  some  aN(t)  £  RN+\  Thus,  applying  (3.9)  and  (3.10),  we  find 

aV)  =  (QN)"1HN(q1(t,-)02(6NwN(t))  +  q2(t,.)D(6NwN(t))+q3(t,.)8NwN(t)) 


w  - 1 

=  (QN) 


r  N, . ,  *n2DN  nN 

I  w.(t)<q,(t,*)D  B.,  Bn> 

j=0  J  1  J  J 


r  N/  . ,  ,.  v n2„N  dN 

l  w,(t)<q,(t,-)D  B.,  Bm> 

j=0  J  1  J  N 


+ 


l  wrJ(t)<q2(t,-)DBrJ,  Bq> 
j  =  0  3  3 


l  w.J(t)<q?(t,-)DB^,  Bjj> 
j=0  3  ^  3  U 
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,  I  w^(t)  <q3(t,.)B'?,  bJ!> 

/  j=0  J  J  o  0 

+  «N>''  j 

\  J  w^(t)  <q3(t,-)Bl  bS 
j=0  J  J 

=  CQN)“  ‘(^(q)  +  K^q)  +  K^(q))wN(t)  . 

Here  the  (N+1)-square  matrices  K^,  K^,  and  have  elements  given  by 
K1(q)i,j  =  B^>  , 

K2(q)i,j  =  <d2(t’^DBj’  B?>  * 

K3(<>)i,r  B?>  • 

Hence  we  may  conclude  that  (3.7)  may  be  rewritten  as  a  linear  (N+l)-vector 
ODE  in  wN(t)  for  t  €  [t^,T], 

wN(t)  =  (QN)  1 ( K ^ ( q )  +  Kg(q)  +  K3(q))wN(t)  +  gN(t), 

(3.11) 

N,.  x  N 
|^w  (t1 )  =  z  • 

We  may  then  apply  standard  differential  equation  theory  to  obtain  the  existence 

of  a  unique  solution  of  (3.11)  on  [t-,,T]  that  depends  continuously  on  ?  . 

< 

Moreover,  since 

tN  =  (QN)'1HN(z(t1))  , 


where  the  well-defined  map  $  -*■  z(t^)  is  continuous,  it  follows  that  the 
solution  to  (3.3)  given  by 

zN(t)  =  8NwN(t), 

O 

is  unique  on  [t^.T]  and  continuously  dependent  on  $  G  Hq.  We  have  only  to 
show  that  the  continuity  is  uniform  in  t  €  [t^,T],  q£Q,  and  N  =  1,2,...  . 
This  follows  easily  from  the  (uniform)  dissipative  result  for  A  ,  as  the 
required  arguments  are  virtually  the  same  as  those  used  to  demonstrate  the 
continuity  (uniform  in  t,q)  of  <j>  -+  z(t;q)  (found  in  the  proof  of  Theorem  2.1). 

Finally,  minor  modifications  in  the  proof  of  Theorem  2.2  yield  the 
following.  (Note  we  use  here  the  continuity  of  q  -*■  z(t-.;q).) 

t  h 

Theorem  3.2:  The  N  approximate  parameter  estimation  problem  has  a  solution 
qN  €  Q  for  each  N  =  1,2,...  . 

We  may,  therefore,  determine  a  solution  qN  to  the  Nth  identification 

problem  by  applying  conventional  optimization  techniques  with  computational 

schemes  that  solve  (3.11)  (the  ODE  in  the  "Fourier"  coefficients  w^(t)). 

Although  we  may  be  able  to  determine  fairly  easily  in  the  ODE-governed 

problem  (especially  for  small  values  of  N),  the  solution  q^  we  find  is 

-N 

meaningful  only  if  q  approximates  the  desired  solution  q  to  the  original 

estimation  problem.  Fundamental  to  the  establishment  of  this  fact  (i.e., 

-N 

the  convergence  of  q  to  q  in  some  sense)  is  the  demonstration  that 
z  (t;q^)  -*•  z(t;q)  for  any  sequence  {q^}  of  parameters  in  Q  that  converges  to 
some  q  €  Q. 


To  pursue  this  line  of  argument,  we  shall  therefore  assume  that  an 
N 

arbitrary  sequence  {q  }  of  parameter  functions  has  been  given  such  that 

qN  -*■  q  in  the  L2(U)  *  . . .  *  L2(U)  topology,  where  q  and  qN  £  Q,  N  =  1,2,...  . 

N  N 

We  first  consider  the  problem  of  obtaining  the  convergence  of  z  (t;q  )  to 

^  3 

z(t;q)  for  initial  data  $  in  a  smooth  but  dense  subset  of  Hq,  a  restriction 

that  simplifies  our  calculations  since  we  are  able  to  take  advantage  of 

several  useful  spline  estimates  (given  below). 

Let  Oj  denote  the  subspace  of  Z  given  by  ^  H  Hq.  Then  for  ^ 

we  obtain  the  spline  estimates  presented  in  Lemma  3.2  below,  the  proof  of 

which  may  be  found  in  [7,  Lemma  2.3]  (the  arguments  use  standard  techniques 

from  spline  analysis  such  as  those  underlying  Theorem  6.13  of  [27]). 

Lemma  3.2  Let  <p  be  given  in  Z.  Then 

C 

(3.12)  \P%  -  *1  <  -Sr  |D%!. 

rr 

c 

(3.13)  | D(PfV  -  <i»)|  1-4  |D%| ,  and 

NJ 

C 

(3.14)  [  D2(PNip  -  *)|  <  -f  |D%!, 

hr 

where  Cq,  ,  C2  are  constants  independent  of  ip  and  N. 

We  turn  now  to  a  rather  technical  result  that  facilitates  later  convergence 
proofs.  To  argue  that  zN(t;qN)  -+  z(t;q),  we  shall  need  to  use  estimates  such 
as  (3. 12)-(3. 14)  where  ip  -  z(t;q)  and  hence  need  conditions  under  which 
z( t;cf)  e^.  This  can  be  guaranteed  if  we  suitably  restrict  the  initial  data 
for  z.  To  this  end  we  define  I  =  H  OHq  and  note  that  I  is  dense  in  the 

3 

initial  data  set  Hn  (in  the  L?  topology).  We  can  then  prove  the  following. 
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Lemma  3.3  Suppose  q  £  Q  is  fixed  and  let  4>  be  given  in  I.  Then  the  solution 
z  to  (3.1)  corresponding  to  the  initial  data  $  is  such  that  z(t;q)  S  ^  for 
each  t  €  [t1 ,t  ]. 


Proof:  We  shall  need  certain  results  from  Chapter  3  of  [18].  From  (HI)  and 


the  definition  of  Q  it  follows  that  for  i  =  l,...,n,  and  s  =  0,1,2, 


3sq. 


3X 


-,  |f,  and  (where  g(t,x)  =  f(t,x,q(t,x)))  are  uniformly  Holder  con- 


ax"  3t  3X" 

tinuous  functions  on  U  with  exponents  in  (0,1).  Pick  a£  (0,1)  as  the  maximum 
of  these  exponents  so  that  each  function  is  uniformly  Holder  continuous  with 
exponent  a. 

5  s 

If  $  €  I  is  given,  <j>  belongs  to  C  (0,1)  so  that,  for  s  =  0,...,4,  D  <f>  has 

a  bounded  (continuous)  derivative  on  (0,1).  It  follows  immediately  that  Ds<fi, 

s  =  0, . . . ,4,  are  uniformly  Holder  continuous  on  (0,1)  with  exponent  a.  In 

3  2 

addition,  <t>  e  I  implies  that  <f>  €  HQ  so  that  (D  <f>)(x)  =  (D^)(x)  =  4>(x)  =  0  for 
x  =  0,1.  Therefore,  for  x  =  0,1,  and  t  =  0, 


A(t,q)<|>(x)  +  G(t,q)(x)  =  0  , 

where  we  have  used  hypothesis  (H2)  to  set  the  second  term  equal  to  zero. 

We  note  that  this  last  condition  is  just  the  condition  "L^  =  f  on  3B"  in  the 
notation  of  [18,  p.  75]. 

Standard  regularity  theorems  for  parabolic  equations  may  now  be  applied: 
Specifically,  one  may  use  Corollary  1  of  [18,  p.  78]  (with  p  =  1  in  the 
notation  of  [18])  to  make  minor  modifications  in  the  arguments  underlying 
Theorem  12,  p.  75,  of  the  same  reference  (with  p  =  2  for  the  spatial  derivatives) 
and  conclude  that  D^z(t;q)  is  uniformly  Holder  continuous  on  (0,1)  (with 


exponent  a)  for  every  t  €  [t^,tr]. 


23 


The  preceding  results  may  now  be  employed  to  demonstrate  convergence  of 
zN( t ;qN )  to  z(t;q)  whenever  z(t;q)e^.;  i.e.,  when  <j>£  I  and  te  [t-j  * tr] - 

N 

Theorem  3.3:  Let  4  £  I  be  given.  Suppose  {q  }  is  arbitrary  in  Q  with 
q^  -*■  q  and  q  £  Q.  Then 

zN(t;qN)  -*■  z(t;q) 

as  N  -+  ®,  uniformly  in  t  £  [t^,tr]. 

Proof:  Where  no  confusion  results,  we  shall  let  zN(t)  and  z(t)  represent  the 

solutions  to  (3.2)  and  (3.1)  corresponding  to  q^1  and  q,  respectively.  Then 
for  t  e  [t1,tr], 

(3.15)  |zN(t)  -  2(t)|  <  |zN(t)  -  PNz(t)|  ♦  |PNz(t)  -  z(t)| 

where  the  second  term  converges  to  zero  as  N  ->■  ®  from  (3.12)  and  Lemma  3.3; 
in  fact,  convergence  is  uniform  in  t  due  to  the  compactness  of  {z(t)|t€  [t-|,tr]} 
in  Z.  It  remains  to  consider  |z^(t)  -  PNz( t) | .  From  (3.1),  (3.2)  and  (2.11) 
we  have 

|z\t)  -  PNz(t)l2  =  | PNz(t1  ;qN)  -  PNz(t1;q)|2 

+  2  /  <AN(o ,qN) (zN(o)  -  P^z(a) ) ,  zN(a )  -  PNz(a )>da 
tl 

+  2  /t  <AN(a,qN)PNz(a)  -  PNA(o,q)z(o),  zN(a)  -  PNz(o)>do 
tl 

+  2  l  <PNG(o,qN)  -  PNG(a,q),  zN(a)  -  PNz(c)>do  . 
t. 


I 
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Appealing  to  arguments  similar  to  those  found  in  the  proof  of  Theorem  2.2,  we 
thus  argue 

(3.16)  |zN(t)  -  PNz(t)  |  2  <  T-j  (N )  +  T2(N)  +  T3(N)  +  2(«+l)  /trfzN(a)  -  PNz(o)|2do  , 

41 

where 

T1  (N)  =  J  z(t-j  ;qN)  -  z(t7  ;q)  | 2  , 


To(N)  =  /  r|AN(a,qN)PNz(a)  -  PNA(a,q)z(a) | 2do  , 
*1 


It  remains  only  to  verify  that  T.(N)  -*•  0  as  N  ,  i  =  1,2,3;  an  application 
of  the  Gronwall  inequality  to  (3.16)  will  then  give  the  desired  convergence  of 
| zN( t)  -  PNz ( t ) j  to  zero  as  N  >  ®,  uniformly  in  t  €  [t^,tr]. 

The  convergence  of  T-j (N )  to  zero  is  an  immediate  consequence  of  the  proof 
of  Theorem  2.2.  In  addition,  T^(N)  0  from  (H3).  Further,  T3(N)  £  2x^(N)  + 

2 T2 ( N )  where 

t,(N)  =  ftr  |AN(a,qN)PNz(a)  -  PNA(c,qN)z(a) | 2da 

h 

and 

t7(N)  =  /  r  |PNA(a,qN)z(a)  -  PNA(a,q)z(a) | 2da 
*1 

converge  to  zero  as  N  +  ®,  (the  argument  for  follows  the  reasoning  of  (2.13)). 
Considering  ,  we  first  observe  that  the  integrand  satisfies 
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l(PNA(o,qN)PN)PNz(<,)  -  PNA(o,qN)z(o)| 

<  |AC»,qN)(PNz(o)  -  z(q))| 

i  lq"(°,-)D2(PNz(0)  -  z(o))l  +  iq”(CT,-)D(PNl(o>  -  z(o))[ 
*  |q”(«,-KPNz(a)  -  z(o))| 


ih(VVi)  l»4*(«4)l 


N,  , 

=  0  (o) 


from  (3.12),  (3.13),  (3.14)  and  the  definition  of  Q.  Furthermore,  the  con- 


vergence  of  0  (a)  -*•  0  as  N  ->  «  is  dc»ninated  since 
e'^a)  _<  M(Cq  +  C-|  +  C2)jD4z(a;q)|  , 


» 


where  the  map  a  ->  |D  z(a;q)|  is  in  L2(t-|,tr)  since  $  €  I  (the  arguments  are 
similar  to  those  used  to  prove  Lemma  3.3;  i.e.,  the  map  (a,x)  -*■  D4z(o;q)(x) 
is  uniformly  Holder  continuous  on  (t^,tp)  *  (0,1). 


Finally,  we  are  able  to  derive  state  variable  convergence  for  arbitrary 

<t>  e  Hq. 

Theorem  3.4:  Suppose  that  q^  -*■  q  where  q^  and  q  are  arbitrary  in  Q.  Then 

3 

for  any  $  e  HQ, 

zN(t;qN)  -  z(t;q) 


as  N  ■*  ,  uniformly  in  t  £  [t.|  ,trJ. 


I 
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3 

Proof:  Since  I  is  dense  in  Hq,  an  element  <p  may  be  chosen  from  I  so  that 

the  quantities  jzN(t;<j>,q)  -  zN(t;$,q)|  and  |z(t;4>,q)  -  z(t;<fr,q)|  are  as  small 
as  desired  (uniformly  in  t,  q,  and  N;  see  Theorems  2.1  and  3.1).  Therefore, 
for  any  t  €[t^,tr3,  we  find  that 

|zN(t;4>,qN)  -  z(t;<fr,q)| 

<  |zN(t^,qN)  -  zN(f,;,qN)S 

+  |zN(t;4>,qN)  -  z(t;4>,q)  ] 

+  | z(t;<j>,q)  -  z(t;<|>,q) | 

will  be  arbitrarily  small  for  N  sufficiently  large.  The  convergence  result 
thus  obtains. 


Remark  3.1.  If  the  perturbing  function  f  is  independent  of  q,  it  is  useful 
to  note  that  we  may  relax  the  spatial  differentiability  requirements  on  f 
(given  in  (HI))  in  much  the  same  way  that  this  was  done  for  <j>  in  Theorem  3.4 
above.  That  is,  we  may  replace  (H1)~(H3)  by  (HI)'  and  (H2)  ,  given  below, 


and  obtain  the  initial  convergence  results  (Theorem  3.3)  for  (<j> ,f }  in  a  smooth 

3 

but  dense  subset  of  Hq  x  ( U) ;  as  before,  the  additional  smoothness  is  needed 


to  guarantee  that  the  corresponding  solution  z(t;<j>,f,q)  belongs  to 


3 


for 


t  e  [t-j,tr].  Under  the  new  hypotheses  on  f,  existence  and  uniqueness  of 


solutions  to  the  original  parabolic  equation  are  still  assured. 


I  3  f 

(HI)  The  perturbation  function  (t,x)  -»  f(t,x)  and  its  derivative 

a  U 

are  uniformly  Holder  continuous  on  U  with  exponent  in  (0,1). 


(H2)  The  function  f  satisfies  f(0,0)  =  f(0,l)  =  0. 


We  note,  however,  that  a  price  must  be  paid  for  this  formulation  of  the  problem: 
When  q^  $  0  or  q^  f  0,  nonhomogeneous  boundary  conditions  can  no  longer  be 
allowed  since  the  transformation  to  homogeneous  boundary  conditions  (see  (2.5), 
(2.6))  necessarily  generates  a  q-dependent  perturbing  function. 

We  should  also  remark  on  another  technical  aspect  of  the  above  presentation. 
The  reader  might  be  curious  as  to  why  we  choose  the  first  sampling  time  t-| 
strictly  positive  and  carry  out  the  convergence  proofs  (see  (3.1),  (3.2)  and 
Theorem  3.4)  on  [t-j,T]  instead  of  the  (perhaps)  more  natural  interval  [0,T]. 

This  is  a  purely  technical  matter,  since  if  we  take  t^  =  0  the  establishment 
of  the  fundamental  results  of  Lemma  3.3  becomes  more  delicate.  In  particular, 
we  can  then  no  longer  use  Corollary  1  of  [18,  p.  78]  as  we  did  in  the  proof 
of  that  lemma.  However,  if  we  make  further  smoothness  assumptions  on  the  data 
of  our  problem  (e.g.,  the  coefficients,  initial  data,  nonhomogeneous  perturbing 
function  f),  we  can  instead  invoke  Corollary  2  of  [18,  p.  78]  to  obtain  the 
desired  bounds  and  convergence  results  on  [0,T]  x  [0,1]  in  place  of  [t^,T]  x 
[0,1]  where  t^  >  0.  In  our  calculations  in  section  4  below,  we  actually 
approximate  the  functional  coefficients  on  [0,T]  and  exhibit  the  convergence 
in  this  case. 

The  smoothness  assumptions  that  we  have  assumed  on  the  coefficients  are 
reasonable  unless  one  is  dealing  with  problems  in  which  the  coefficients  have 
jump  discontinuities.  Our  methods  also  perform  well  in  such  situations  (this 
comment  is  based  on  use  of  our  techniques  in  numerical  calculations  for  other 
classes  of  estimation  problems),  but  the  above  theory  does  not  extend  in  a 
straightforward  manner.  Rather,  one  can  use  a  weak  formulation  of  the 
system  (2. 1)— (2.3)  (e.g.,  see  [10])  to  develop  a  convergence  theory  in  this  case. 


j 
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N 

To  this  point  we  have  focused  on  the  convergence  of  solutions  z  (of 
(3.3))  to  the  solution  z  (of  (3.1))  once  the  convergence  of  any  sequence  of 
parameters  has  been  assured.  In  reality  however,  we  have  yet  to  determine 
whether  any  sequence  of  solutions  (q^}  of  the  approximating  problems  is  in  fact 
convergent;  even  then  we  have  no  guarantee  that  the  limiting  function  q  is 
indeed  a  solution  to  the  original  parameter  estimation  problem.  Our  next 
result,  (similar  in  spirit  to  that  found  in  [4],  [9],  [13]),  addresses  this 
question  and  indicates  when  an  approximate  identification  problem  may  be  used 
to  compute  numerical  solutions  for  the  original  problem. 

-W  _w 

Theorem  3.5.  Let  {q  }  be  given,  where  each  q  is  a  solution  to  the  approximate 

parameter  identification  problem  for  (3.3),  (3.4).  Then  there  exists  q£Q 

-Nk  -Nfr  - 

and  a  subsequence  {q  }  such  that  q  -»•  q  and  q  is  a  solution  to  the  original 
estimation  problem  for  (2.9),  (2.12). 


Proof:  Since  Q  is  compact,  convergence  of  a  subsequence  {q  *}  to  some  q  in 

Q  is  ensured.  In  fact,  it  is  easy  to  see  that  q  is  a  solution  to  the  original 
parameter  estimation  problem.  From  (2.12), 


J(q)  *  .I  |yi  -  C(ti,q)z(ti;q) 


-  I  \it  -  c(vqVk(v5Nk>l2 

N.--  1  =  1  1  1  1 

k 


=  lim  J  (q  ) 

N. -*« 
k 

<  lim  J  K(q) 


29 


for  any  q  €  Q.  But  Theorem  3.4  is  also  valid  with  the  constant  sequence  {q}, 

N 

so  that  z  (t;q)  -*•  z(t;q)  uniformly  in  t  e[t-|,tr];  we  are  thus  guaranteed  that 
Nk 

lim  J  K(q)  =  J(q) 

V" 

so  that  J(q)  _<  J(q)  for  any  q£  Q, 


In  the  discussions  thus  far,  our  problems  have  been  formulated  in  terms 
of  iterations  and  searches  in  the  parameter  function  space  Q.  Of  course,  we 
cannot  realize  such  searches  on  a  computer.  Indeed  we  must  have  for  the  elements 
in  Q  some  type  of  representation  amenable  to  implementation.  One  possibility 
(see  [12])  is  to  assume  an  a  priori  functional  representation  (e.g.,  polynomials 
of  degree  <_  k  with  coefficients  ranging  over  some  given  set)  so  that  the  parameters 
sought  and  convergence  argued  actually  involve  finitely  many  constants  in  some 
fixed  subsets  of  Eucl idean  space.  For  this  approach  one  must  have  an  idea  of 
the  form  (shape)  of  the  unknown  functional  coefficients.  In  this  event  the 
above-developed  theory  obviously  suffices  to  establish  convergence  results 
and  an  implementable  scheme.  However  our  theoretical  framework  is,  in 
reality,  much  more  general  and  can  be  employed  to  develop  methods  where  one 
actually  searches  for  the  shape  of  functional  coefficients  by  seeking  functional 
approximations  (say  in  the  space  of  linear  or  cubic  splines).  We  do  this  in 
the  examples  of  section  4  below.  To  illustrate  how  the  above  theory  can  be 
applicable  in  this  situation  (we  recall  that  all  the  results  are  given  for  Q 
compact  in  the  sense  and  that  the  continuous  dependence  of  z  and  J  on  q  is 
in  the  norm),  we  first  consider  the  case  where  one  uses  piecewise  linear 
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spline  approximations  to  functions  in  Q  (which  is  natural  if  Q  is  compact 

in  the  C  or  supremum  norm  topology). 

We  assume  that  Q  C  C  is  compact  in  the  C  topology  and  let  L^M, )  and 

L  (M-)  denote  the  spaces  of  piecewise  linear  splines  (see  [27,  p.  11]) 

T-t, 

corresponding  to  equal  partitions  of  [t^,T]  and  [0,1]  of  mesh  size  -j^ —  and 
,  respectively.  That  is,  we  use  the  piecewise  linear  splines  with  knots 

tk=  t1  +  (k-l)((T-t1)/M1),  k  =  1,2, —  ,M-|+1 ,  and  xm  =  (m-l)/M2,  m  =  1 ,2 . M2+l 

We  denote  the  corresponding  2-dimensional  interpolating  linear  spline  operator 
M 

by  I  where  M  =  (M, ,M2).  Thus  if  q  is  continuous  on  [t-|,T]  *  [0,1], 

M-,+1  M2+l 

(IMq)(t,x)  =  l  l  q(tk,x  )8k(t)p  (x) 
k=1  m=l  k  m  k  m 

M1  M2 

where  b.  =  3.  and  p  =  p  are  the  appropriately  normalized  (depending  of 

l\  ill  III 

course  on  the  partition  mesh)  usual  "patch”  function  piecewise  linear  basis 
elements  [27,  p.  11]  defined  on  the  intervals  [t^T]  and  [0,1],  respectively. 
Clearly  the  mapping  :  Q  -  C  is  continuous  (in  the  C  topology  on  Q)  and 
hence  QM  =  IM(Q)  is  compact  in  C.  Since  any  q  =  (q^ ,q2, . . .  ,q  )  in  QM  can  be 
written  as  q  =  IM(q)  for  some  q  £  Q,  it  follows  that  QM  has  the  representation 

(3.17)  QH  =  {c:[t,.T]  »  [0,1]  -R"|  J  I  WkV  6  *1  J 
^  km  J 


where  the  A . km  are  appropriately  chosen  compact  subsets  of  R  . 

N 

We  then  carry  out  the  minimization  procedure  (for  J  )  described  earlier 

M 

in  this  paper  over  the  function  set  Q  ,  obtaining,  for  each  state  approximation 

index  N,  a  best  parameter  function  q  (M)  in  Q  .  Due  to  the  compactness  of 

QM,  we  can  extract  a  subsequence  (we  relabel  to  avoid  subsequence  notation) 

converging  to  some  limit.  *  ■motion  q(M)  in  QM,  i.e.,  q.(M)  =  lim  q^(M), 

1  N-  - 
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P 


i  =  l,...,n.  From  our  earlier  discussions  (Theorem  3.5),  q(M)  provides  a 
minimum  for  J  over  Q^.  From  the  definition  of  Q^,  it  follows  there  exists 
q(M)e  Q  such  that  q(M)  =  I  (q(M))  for  each  M.  From  the  compactness  of  Q, 
we  can  extract  a  subsequence  q(MJ)  such  that  q(MJ)  converges  (in  C  norm)  as 
-*■  ®  (recall  M  =  (M-j.Mg)  and  hence  by  MJ  -*•  °°  we  shall  mean  ®  and 
-*■  «)  to  some  element  q*  in  Q. 

To  argue  that  q*  is  a  solution  to  our  original  problem  of  minimizing  J 

2 

over  Q,  we  need  further  assumptions.  Specifically  we  assume  Qc  H  with  the 
2  2 

quantities  } Dxq j  ,  J D^q |  ,  |DxDtq|  uniformly  bounded  as  q  ranges  over  Q. 
Then  we  also  find  that  1^  (q(MJ ) )  -►  q*  in  the  topology  since 

(3.18)  |IMJ(q(MJ'))  -  q*|  <  |IMJ(q(Mj))  -  q(Mj)|  +  |g(MJ')  -  q*| 


<  Q.  +  lq(Mj)  -  q*| 

(Mj) CM') 

|  and  this  last  expression  (the  first  term  of  which  follows  from  a  minor 

modification  of  Theorem  2.7  of  [27,  p.  19])  approaches  zero  as  -*•  «. 

Recalling  that  J(q(M))  £  J(q)  for  all  q€  QM,  and,  given  the  fact  that 
QM  =  IM(Q),  thus 


(3.19)  J (q(M) )  <  J(IM(q))  forallqeQ, 


we  next  observe  that  the  same  basic  spline  result  used  in  (3.18)  yields 
I  (p)  -*■  q  for  any  q€  Q.  Furthermore,  since  q(M)  =  l  (q(M))  for  all  M  and 
IMJ(q(M^))  -*•  q*,  we  may  use  the  continuity  of  J  (in  the  sense)  to  take 
limits  in 
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J(IMj(q(Mj)))  <  J(IMJ(q)),  q£Q, 
to  obtain 

J(q*)  £  J(q)  for  all  q  €  Q. 

We  thus  find  that  q*  =  lim  q(M^)  =  lim  IMJ(q(M^))  =  lim  q(M^)  = 

MJ+co  m3-*-  00  M^-*-00 

-Nk  i 

]im  lim  q  (Nr)  provides  a  minimum  for  J  over  Q.  We  summarize  our  findings 

MJ-*-  «>  N.  -+  °° 
k 

in  the  formal  statement: 

2  2  2 

Theorem  3.6.  Suppose  QC  H  with  [  D^q  I  »  |D£q|,  |  D^D^q  |  uniformly  bounded 

M  M/  x 

for  q  €  Q.  Further  assume  that  Q  is  compact  in  its  C  topology  and  let  Q  =  I  (Q) 

-N 

denote  the  linear  spline  approximation  sets  as  given  in  (3.17).  Let  q  (M)  be 

a  solution  obtained  from  minimizing  JN  (see  (3.4))  over  QM.  Then  there  exists 

★  Nj/  • 

q*  in  Q  minimizing  J  over  Q  and  a  subsequence  q  k(MJ)  with  q*=  lim  (  lim  q  K(MJ)). 

m3->-<» 

We  next  consider  approximating  the  parameter  functions  with  bicubic 

elements.  That  is,  we  assume  the  basis  elements  g.  and  p  in  (3.17)  are  those 

k  m 

for  St(M^)  and  S*^),  the  cubic  splines  [27,  p.  44,  48,  49]  on  [t-j,T]  and 

M 

[0,1],  respectively.  The  corresponding  bicubic  interpolating  map  I  is 

2 

continuous  in  the  C  topology  on  Q  (the  interpolation  map  involves  functional 
values  of  q  at  interior  grid  points  of  [t ^ ,T]  *  [0,T],  values  of  D^q,  Dxq  at 
lateral  boundaries,  and  those  of  D.D  q  at  the  corners  --  see  [27,  p.  49,  50]). 

t  A 

2 

If  we  assume  that  Q  is  C  compact,  we  again  can  obtain  a  representation  for 

QM  =  IM(0)  of  the  form  (3.17)  where  the  coefficients  range  over  compact  sets 

in  R^.  To  make  arguments  analogous  to  those  involving  (3.18),  (3.19),  we 

4  4  4  2  2, 

need  only  to  further  as^u^e  Q  C  H  with  D  q|  ,  1 D. q ;  ,  |D  D  q|  uniformly 

X  L  L  X 


4  2 

bounded  for  q  £  Q  and  employ  Theorem  4.9  (which  holds  with  PC  ’  replaced  by 
H^)  of  [27,  p.  60].  The  following  corollary  to  the  above  results  is  thus 
obtained. 

Corollary  3.7.  Suppose  Q  C  H  is  compact  in  the  C  topology  with  |D  q|, 

|D^qj,  I D^D^q |  uniformly  bounded  for  q  £  Q.  Let  =  i'^(Q)  denote  the 
bicubic  spline  approximations  to  Q  as  defined  above.  Then  the  conclusions 
of  Theorem  3.6  are  also  valid  in  this  case. 

If  the  parameter  functions  are  separable,  i.e.,  q. (t,x)  =  h. (t)c^(x),  we 

can  relax  the  assumptions  in  the  above  theorem  and  corollary.  For  example 

suppose  Q-j  C  {h  :  [  t-,  ,T]  Rn}  and  Q  {c  :  [0,1]  -*•  Rn}  are  given  subsets  of  h"* 

with  both  ,  Q2  compact  in  their  respective  C  topologies  and  let  Q  =  Q-|  ©Q2- 

That  is,  Q  =  {q  =  h  ©  c  |  h  €  Q-j ,  c  €  Q2)  so  that  the  components  of  q  €  Q  are 

given  by  q-(t,x)  =  h.(t)c.(x),  i  =  1 ,2 , . . .  ,n.  We  further  assume  that  |DhL, 
111  ^ 

[  0c  [  2  are  uniformly  bounded  in  he  Q^,  c  G  Q^. 

Mi  M? 

Let  I.  and  I  denote  the  piecewise  linear  interpolating  splines  [27,  p.  11] 
*  x  T-t, 

on  [tpT]  and  [0,1]  corresponding  to  mesh  sizes  — and  1/M2,  respectively. 

Mi  Mo  I 

Then  1^.  and  1^  are  continuous  from  Q-j  and  Q2  (with  their  C  topologies)  to 

t  .  y  Mi  Mi  Mo  Mo 

Ll(M-j)  and  L*(M2),  respectively,  and  both  Q-j  e  (Q^)  and  Q2  e  Ix  (Q2)  are 

compact  with  representations 

M,  f  V1  'l 

V  *  (h:  [t,.T]  *  n"  ■  hf(t)  -  ^  «1k9k(t).  aike  afk ] 


and 


M2+l 


Mo  f  n  l 

02  =  {  c  :  [0,1]  -  Rn  |  c.(x)  =  l  YimPn(x), 
^  m=l 

whereQf^,  Pim  are  compact  subsets  of  r"*. 


Y  •  C 
im 


r,ra} 
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N 

We  then  can  carry  out  the  minimization  (for  each  N)  of  J  over 
M  Ml  Mo  M  M  _M 

Q  =  Q1  ©  Qg  ,  obtaining  minimizers  q  (M)  =  ri  (M^)  ©c  (M2).  The  compactness 

M1  m2  rNk 

of  Q-|  and  Q2  imply  the  existence  of  convergent  subsequences  {n  (M^)}, 

N,,  Nb  -  _Nl  _  _  -  _  m 

(c  K(Mg)}  with  R  k(M-j)  -*■  h(M^),  c  *(Mg)  *  c ( Mg )  where  q(M)  =  h(M^)  ©  c(Mg)  . 

M1  , 

It  follows  that  there  exist  h(M^),  c(Mg)  in  Q^ ,  Qg  such  that  h(M-j)  =  1^  (h(M^)), 

M?  , 

c ( Mg )  =  I  ” ( c ( Mg ) ) .  Since  ,  Qg  are  compact,  we  may  obtain  convergent  sub- 

*  A  00 

sequences  of  {h(M^ ) { c( M^ ) T M  _-|  (relabeling  if  necessary,  we  again 


call  them  {h(M, )},  (c(Mg)})  converging  as  -»•  Mg  -*■  <»  to  h*,  c*  in  ,  Qg 
respectively.  It  follows  that  (here  we  distinguish  between  the  Lg  and  norms) 

M, 


l”1(h(M1))  ©  l'x'2(c(M2))  -  h 


★  * 
©  c 


1  |It1(h(M1))  ©  l‘x2(c(Mg))  -  h^)  ©  c(Mg)|g+  |  h(M1 )  ©c(Mg)  -  h*©c*|g 


M 


1  | It1(h(M1))  ©  [Ix2(c(Mg))  -  c(Mg)]|g 


+  | [It1(h(M1 ))  -  h(M1)]  ©  c(Mg) | g  +  |h(M1)  ©  c(Mg)  -  h"  ®  c"|2 


rMl/: 


<  K|It,(h(M1))|oo  |Ix2(c(Mg))  -  c(Mg)  |  g 


+  K | C ( Mg ) | ^  | It  ,(n(M1 ))  -  h(M1) |2  +  | h(M] )  ©  c(Mg)  -  h*  ©  c*|g 


^  (M/Mg)  sup  !  Dc  1  g  +  (Ai ( T- 1 1  )/M-| )  sup  |  Dh ]  g 


ceQ 


+  I  h(M-| )  ©  r(Mg)  -  h  ©  c  |  g 


0  as  M^  Mg 


h*Q, 
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where  the  first  two  terms  in  the  last  estimate  follow  from  Theorem  24  of 
[27,  p.  16]. 

From  the  above  one  can  conclude  (the  arguments  are  similar  to  those 
following  (3.18))  that  q*  =  h*  ®  c*  is  a  solution  of  the  original  problem  of 
minimizing  J  over  Q.  We  thus  have: 


Theorem  3.8.  Suppose  Q  =  Q-j  ®  Q2  with  Q.  C  h  ,  Q.  compact  in  its  C  topology 

,  ,  ,  >  m  Ml  M? 

with  |Dh|  ,  ] Dc (  uniformly  bounded  for  h£  ,  c£  1)2-  Let  Q  =  Q-j  ®  Q2 

be  the  (linear)  spline  approximation  to  Q  as  defined  above  and  let  q^(M)  = 

h^(M-j)  ®  c^^)  be  a  solution  to  the  problem  of  minimizing  over  QM.  Then 
*  *  * 

there  exists  q  =  h  <s>  c  in  Q  which  provides  a  minimum  for  J  over  Q  and  a 


subsequence  q^(MJ)  such  that  q  =  h  ®  c*  =  lim  /  lim  h^M'j)  ®  c  ^(Mi)'] 

..i  \  N.-+«,  1  L  • 


A  J\  « 


J 

Mg-*-  00 


There  is  an  obvious  corollary  to  this  theorem  involving  use  of  cubic 
splines.  I f  Q  =  0^<S>  and  we  wish  to  use  the  cubic  splines  St(M^)  and 

v 

S  (M2)  to  approximate  the  sets  and  C^,  respectively,  the  conclusions  of 

? 

Theorem  3.8  are  valid  under  the  assumptions  that  Q.  c  H  ,  compact  in  its 
C  topology  with  ]  D  h  j  ,  |  D  c  |  uniformly  bounded  for  he  Qp  c€  Q2  (compare 
with  the  assumptions  of  Corollary  3.7).  Finally  one  can  also  wish  (in  the 
separable  Q  =  ®  Q2  case)  to  approximate  one  parameter  set  (say  Q^)  by 

linear  spline  elements  ( L ) )  while  approximating  the  other,  C^,  by  cubic 
elements  (S  (M2))  (see  the  examples  in  section  4).  Obvious  analogues  of  the 
above  convergence  results  can  be  easily  given  in  these  cases  also. 


§4  Numerical  results 


In  this  sectioh  we  present  our  numerical  findings  for  a  number  of 
representative  parameter  estimation  problems,  including  an  example  that 
illustrates  how  our  spline-based  methods  can  be  used  to  identify  variable 
diffusion  and  convection  parameters  in  a  dispersion  model.  The  goal  of  our 
efforts  was  to  ectablish  the  effectiveness  of  these  methods  as  they  are  used 
to  determine  a  variety  of  parameters  that  depend  on  time  and/or  spatial 
variables,  and  to  observe  convergence  properties  in  several  different  examples. 
To  this  end,  we  developed  a  Fortran  software  package  (a  modification  of  one 
created  by  Dr.  James  Crowley  when  he  was  a  student  at  Brown  University)  to 
identify  parameters  in  the  approximate  ODE  system  (3.11);  all  test 
computations  were  carried  out  on  the  CDC  6600  at  Southern  Methodist  University. 

The  numerical  examples  that  appear  in  this  section  were  formulated  in 
the  following  manner:  "True"  parameter  functions  q  =  (q-j ,  q2»  q3)  and  a  true 
solution  u(t,x)  were  chosen  for  each  example  a  priori.  (In  each  case, 
u(t,0)  =  u(t,l)  -  0,  although  only  Example  4.1  satisfies  the  hypothesis  that 

3 

the  initial  data  <p  lie  in  Hq.  Indeed,  it  is  not  surprising  that  the  methods 

under  investigation  perform  admirably  when  applied  to  examples  that  satisfy 

weaker  assumptions  than  those  we  used  in  arguing  theoretical  convergence.) 

An  r  x  (s-1)  grid  of  sample  data  points  y. .  was  determined  by  setting 

J 

y..  =  u(t.,x.)  (with  random  noise  added  in  Example  4.1)  where  t.  =  (i-l)T/r, 

1  J  J 

i  =  l,...,r,  and  x.  =  j/s,  j  =  l,...,s-l.  The  estimation  problem  then 
becomes  that  of  determining  q  €  q  (and  u(q))  that  minimizes 
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where  u  satisfies  (2.1)-(2.3);  for  each  example,  4>  and  f  were  given  at  the 
outset  by  <*>(x)  =  u(o,x)  and 


2- 


f(t,x)  =  ~  -  q-j  (t,x)  ~  -  q2(t,x)  -  q3(t,x)u(t,x). 

d  ax  u 


We  remark  here  that  the  cost  functional  J  defined  above  is  not  even  a  special 
case  of  the  distributed  least  squares  fit-to-data  functional  given  in  (2.4). 

We  could  have,  of  course,  interpolated  our  data  {y .  . >  and  used  (2.4);  this 

'  J 

would  have  kept  us  in  the  framework  developed  above.  We  didn't  do  this  since 

it  has  been  our  experience  that  one  actually  obtains  a  much  stronger  convergence 

(pointwise  in  x)  for  z  (t;q)  =  u  (t,*,q)  than  the  norm  convergence  that  is 

usually  guaranteed  a  priori  by  the  theory  (for  cases  where  one  can  establish 

the  stronger  convergence  see  Lemma  2.4  and  Theorem  2.1  of  [7]).  In  addition, 

we  have  in  our  numerical  tests  violated  the  assumption  that  the  first  sample 

time,  t-|,  be  strictly  positive  (t1  =  0  for  all  of  our  examples).  No  real 

problems  are  posed  by  these  discrepancies  however:  As  was  noted  earlier, 

this  particular  restriction  may  be  relaxed  if  we  are  willing  to  assume 

additional  smoothness  on  the  coefficients  q.  Continuity  requirements  of  this 

sort  are  certainly  met  in  the  test  examples  chosen  here. 

Several  different  finite-dimensional  representations  for  the  functional 

parameters  were  discussed  in  detail  in  section  3.  We  assume  throughout  this 

section  that  q^(t,x)  =  h..(t)c^(x)  and,  for  a  given  N,  attempt  to  identify 

"optimal"  approximations  for  h.  in  Lt(M1)  or  St(M^)  and  in  LX(M2)  or  SX(M2), 

-N 

i  =  1,2,3;  that  is,  given  initial  guesses  for  and  y^m,  we  determine  a.^ 
and  k  =  1 , . . . ,K] (M1 ) ,  m  =  1 , . . . ,K2(M2),  so  that 
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s 

« 


x 


-N 


«1 


hi(t)  *  JikBk(t) 


and 


-N 


"2 


c?(x)  =  l  p  (x) 
i  'mrr  ' 


m=l 

are  "optimal"  in  the  sense  described  in  previous  sections.  For  each  example, 

-N 

we  take  M^  and  Mg  fixed;  thus  there  should  be  no  confusion  in  writing  and 

-N  _N  -N 

Y-jm  throughout,  instead  of  the  intended  meaning,  a^(M^)  and  y ^ m ( Mg ) -  We  remark 

here  that  this  choice  of  approximation  for  h. ,  c^  simplifies  the  effort  required 

to  implement  the  approximating  parameter  estimation  problems.  Because  standard 

parameter  identification  packages  require  that  (3.11),  the  system  in  w  ,  be 

solved  for  each  updated  value  of  q,  the  inner  product  matrices  ,  i  =  1,2,3, 

need  to  be  recomputed  for  each  q- iterate.  This  is  a  simple  matter  given  the 

representations  chosen  here;  for  example,  the  (t.j)-entry  of  K-j  becomes 

h,(t)<c,D2BN,  B^>  -  (  ^  „)kBk(t)  j  j*  T,„  <p/b 1$,  B"> 


so  that  the  inner  products  need  to  be  computed  only  once  (prior  to  the  iterative 
process)  and  simply  combined  appropriately  as  and  are  updated. 

Our  experience  with  numerous  numerical  examples  has  indicated  that  greater 
accuracy  is  required  for  the  spline  representation  for  c..  than  for  h^ ,  which 

is  not  surprising  since  c^  is  involved  in  calculating  the  n.ner  products  for 

N  t  x 

K^,  etc.  For  this  reason,  we  chose  L  (M^ )  for  h..  and  S  (Mg)  for  c^  in  the 

examples  presented  here,  except  for  Example  4.3  where  LX(Mg)  is  used  for  c. 

as  well.  The  choice  of  M^  and  Mg  is  clearly  problem-dependent,  especially 

when  the  time  interval  [0,T]  is  very  large;  in  what  follows  we  pick  M-j  and  Mg 
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so  that  both  L  (M^)  and  S  (Mg)  have  exactly  four  basis  elements,  a  number  large 

enough  to  ensure  reasonably  accurate  approximations  for  h^  and  c^  yet  small 

enough  to  keep  the  number  of  unknowns  i  =  1,2,3;  k  =  1 , . . . , ; 

m  =  l,...,Kg)  at  a  manageable  level.  We  therefore  take  =  3  and  Mg  =  1 , 

except  in  Example  4.2,  where  the  four  basis  elements  for  S  (Mg)  are  modified 

to  satisfy  homogeneous  boundary  conditions  (and  thus  Mg  =  3  must  be  taken). 

For  given  values  of  N,  M-j ,  and  Mg,  we  used  ZXSSQ,  the  IMSL  version  of  the 

Levenberg-Marquardt  algorithm,  to  iteratively  determine  the  desired  parameters. 

Our  experience  with  this  package  has  been  relatively  good  (it  is  easily 

implemented  and  usually  performs  well  with  default  values  for  several  of  the 

required  input  variables)  so  long  as  we  did  not  need  to  estimate  more  than 

a  combined  total  of  7  or  8  unknown  coefficients  in  the  approximation  for  tv 

and  Cj.  The  package  tends  to  fail  in  attempts  to  estimate  a  larger  number 

of  parameters  and  this  may  be  due  to  the  fact  that  ZXSSQ  computes  all  needed 

gradients  using  a  finite  difference  scheme  (although  preliminary  testing  of 

MINPACK's  version  of  the  Levenberg-Marquardt  scheme,  LMSTR1 ,  with  more  accurate 

user-supplied  gradients,  has  not  indicated  that  this  is  necessarily  the  sole 

source  of  the  difficulty).  It  is  more  likely  that  a  large  number  of  unknown 

parameters  is  associated  with  an  excessive  number  of  degrees  of  freedom  in 

the  problem,  which  manifests  itself  in  the  inability  of  the  iterative 

-N 

scheme  to  converge  at  al 1  to  q  (M)  for  N  and  M  fixed. 

As  can  be  expected,  increasing  the  number  of  unknowns  also  increases 

the  CP  time  required,  as  well  as  the  number  of  requisite  iterations  on  q. 

It  was  also  our  experience  that  it  was  far  more  difficult  to  identify  c^ 

than  h.. ,  possibly  due  to  the  fact  that  c.  always  appears  in  integrated  form 

N 

(in  the  inner  product  matrices)  which  suggests  that  the  solution  w  to  (3.11) 
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is  less  sensitive  to  small  changes  in  c.  than  it  is  with  respect  to  changes 

in  h.j .  In  such  examples  the  parameter  estimation  process  was  frequently  quite 

expensive  (CP  time  for  N=3  problems  with  6-8  unknown  coefficients  for  hi  and 

Cj  often  exceeded  1000  seconds;  in  contrast,  it  took  only  345  CP  seconds  to 

accurately  identify  the  8  unknowns  in  Example  4.1.  where  parameters  did  not 

depend  on  the  spatial  variable).  Even  with  fewer  unknowns,  these  parameter 

estimation  problems  are  generally  more  expensive  (137  CP  seconds  for  Example 

4.1(a);  N=2)  than  those  for  many  other  distributed  systems  (see  [9],  [10]) 

because  the  approximating  ODE  systems  for  these  parabolic  PDEs  tend  to  be 

quite  stiff,  requiring  more  costly  solution  schemes  (for  example,  IMSL's 

DGEAR,  which  is  what  we  used  in  our  computations). 

We  turn  now  to  several  numerical  examples  that  are  representative  of  our 

findings.  In  example  4.1  and  4.2  we  display  qN  =  qN(M)forN  =  2,4,8,  and  16, 

(values  of  or  M2  for  M  are  fixed  throughout,  as  was  previously  indicated) 

-N 

to  demonstrate  how  quickly  q  approaches  q;  in  fact,  convergence  is  so  rapid 

-3 

in  these  examples  that  we  present  only  q  for  the  remaining  examples. 

Example  4.1 

We  consider  the  system 

ut  =  q1(t)uxx  +  q2(t)ux  +  f  ( t ,  x) ,  0<x<l,  0  <  t  _<  1  , 

'  u(0,x)  =  0 
^u(t,0)  =  u ( t , 1 )  =  0  . 

A  9x3  grid  of  sample  data  points  was  generated  from  the  "true"  values  of  the 
solution  and  parameters,  u(t,x)  =  100x(l-x)(x+4)simrt,  q-j ( t )  =  8t+l,  and 
q2(t)  =  -2t+4.  "True"  coefficient  values  for  the  representations 
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i 

1> 


L 


i  ~r 

q1  (t)  *  l  cxlkek(t)  and  q2(t)  =  £  a2kSk(t)  are  then  given  by  ®11  =  1  » 
k- 1  k- 1 


°12  '  3 3  » 


6  3  *  “14  =  9* 


4, 


1 


-  -  o2 

a23  ‘  c  3  ’ 


“24  =  2 


13  3  *  “14  **  “21  "  “22  "  “3 

(in  each  case,  aik  =  ci.jk(M),  i  =  1,2 ;  k  =  1,..., 4).  Several  different 
estimation  problems  were  investigated. 


(a)  We  seek  to  estimate  q-j  with  q2  =  q2  given.  For  our  initial  guess 
we  take  q^  =  .001;  that  is,  a^k  =  .001,  k  =  1,...,4. 


N 

-N 

-N 

-N 

-N 

“11 

“l  2 

“13 

“14 

2 

0.998581 

3.629449 

6.250314 

8.880572 

4 

0.999974 

3.666598 

6.333214 

8.999806 

8 

0.999975 

3.666598 

6.333214 

8.999806 

16 

0. 999974 

3.666598 

6.333214 

8. 999806 

We  repeat  the  computations  in  (a) 

except  that  we 

corrupt  the  "data" 

with 

random  noise 

(varying  between 

-  1  and  +  1  ) . 

N 

-N 

-N 

-N 

-N 

all 

“12 

“13 

“14 

2 

1.044241 

3.605704 

6.282895 

8.932719 

4 

1.046339 

3.642512 

6.366286 

9.051935 

8 

1.045438 

3.640817 

6.368787 

9.049063 

16 

!  1.041110 

3.643830 

6.366950 

9.050610 

We  estimate  both  q 

^  and  q2,  using 

start-up  values 

of  q-j  =  5.0 

and  q2  =  1.5  (olk 

=  5.0,  «2k  =  1.5 

,  k  =  1 . 4). 

The  results  are 

reported  in  Table  4. 1 . 
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(d)  We  repeat  the  calculations  in  (c)  except  we  corrupt  the  data  with 
random  noise  as  in  (b).  Table  4.2  contains  the  outcome  of  this 
investigation.  The  addition  of  random  noise  in  this  case  yields 
satisfactory  estimates  for  all  coefficients  except  a24,  which  appears 
to  be  converging  to  a  value  far  from  ^4  =  2.  We  were  able  to 
determine  however  that  J(q^)  <  J (q )  for  this  example  so  that  the 
parameters  and  q2  that  provide  the  best  fit  to  the  new  (noisy) 
data  are  not  merely  slight  deviations  from  the  parameters  that 


generated  the  original  (noise-free)  data. 


Example  4.2 


We  consider 


ut  =  qj(x)uxX  +  q2(x)ux  +  f(t,x),  0  <  x  <  1,  0  <  t  <_  2 


u(0,x)  =  -  20(x  -x) 


|^u(t,0)  =  u(t,l)  =  0 

with  q-j(x)  =  x-x^,  q2(x)  =  8x  -  8x^  and  true  solution  u(t,x)  =  -  10(x^  -  x)(t^  +  4t  +  2) 

-N  -N 

A  9x3  grid  of  sample  data  was  used  to  search  for  estimates  q^  and  q2  in 

x  r  4 

S0(3)  X  YimPm  I  Pm  is  a  cubic  B-spline  modified  so  that  pm(0)  =  pm(l)  =  0, 


m  =  1 


. 4} 


(the  spline  elements  pm  are  defined  exactly  the  same  as  for 


M  =  3,  where  Bm  was  given  in  section  3).  We  compare  our  computed  values  with 

M  3 

those  corresponding  to  the  projections  P  q.  =  P  q^  which  have  the  coefficients 
Yn  =  0.01698,  y12  =  0.04321  ,  y]3  =  0.04321  ,  y]4  =  0.01698,  and  y2]  =  0.13580, 

Y22  =  0.34568,  y23  =  0.34568,  y24  =  0.13580,  respectively. 


(a)  We  seek  to  identify  q2  with  =  q^  given.  The  initial  guess  for 
q2  ^  =  i • e • »  Y 2jp  —  ® ^  —  1»*»»»4. 


N 

-N 

-N 

-N 

-N 

Y21 

Y22 

y23 

y24 

2 

-0.14416 

0.45511 

0.74438 

-0.37807 

4 

0.13599 

0.34556 

0.34569 

0.13577 

8 

0.13581 

0.34569 

0.34566 

0.13580 

16 

0.13579 

0.34571 

0.34566 

0.13580 

We  estimate  q^  with  q^ 

=  q^  given 

(for  start-up 

values,  y i -j  =  0.05 

Y12 

1.5,  y13 

2.0, 

and  y14  =  2. 

5  are  used). 

The  results  for 

N  = 

3  are  given 

below. 

-3 

-3 

-3 

-3 

Y11 

y12 

Y1 3 

y14 

0.01697 

0.04321 

0.04322 

0.01696 

Example  4.3 

We  present  now  our  parameter  estimation  findings  for  a  problem  with 
parameters  that  depend  on  both  time  and  space  variables.  We  consider  the 
system 

ut  =  q^t.xju^  +  8(x-x2)ux  +  f(t,x),  0  <  x  <  1 ,  0  <  t  <  2 

'  u(0,x)  =  -  20(x2-x) 

^  u(t,0)  =  u(t,l)  =  0. 

A  9x  3  grid  of  sample  data  points  was  generated  from  the  true  solution 
2  2 

u(t,x)  =  -  10  (x  -x)(t  +  4t  +  2).  Assuming  the  "true"  value  of  q^  is  given  by 


€ 

-  . 

X 


q^(t,x)  =  tx  (so  that  h-j(t)  =  t  and  c^(x)  =  x),  the  "true"  basis  coefficients 

t  X 

for  the  representations  of  h^  in  L  (3)  and  c^  in  L  (3)  are  then  given  by 

-2-  1-  -  -1-2- 
“11  =  °’  “12  =  3’  “13  =  1  3  ’  “14  =  2’  and  Y11  =  °»  Y12  =  3  ’  Y13  =  3  ’  Y14  =  1 

Several  different  estimation  problems  were  considered,  using  the  state  variable 

approximation  index  N  =  3  throughout. 

(a)  We  estimate  h^  assuming  c^  =  c^  is  fixed.  To  begin  the  estimation 
process  we  specify  the  initial  guess  as  h-j(t)  =  3  -  j  t;  i.e., 

—  3,  ct i ^  ~  2,  ot^ ^  1,  and  —  0. 

-3  -3  -3  -3 

“11 _ ^12 _ ^13 _ “14 

0.0003  0.6665  1.3338  2.0005 

(b)  We  wish  to  estimate  Cj  with  h^  *=  h^  given.  Start-up  values  of 

1  3 

Y11  ~  2'  y12  =  ^  ’  y13  =  2  *  and  y14  =  2  corresponding  to  an  initial 
3  1 

guess  of  c^ (x)  =  ^  x  +  j  were  used. 

-3  -3  -3  -3 

Y11 _ 112 _ H3 _ Y14 

0.0061  0.3295  0.6727  0.9817 

(c)  We  estimate  and  part  of  c -j ,  assuming  that  the  first  two  basis 
coefficients  of  c^  are  given  by  y -j -j  =  and  =  y-^  •  Start-up 

values  for  the  remaining  coefficients  are  =  .5,  = 

“i3  ""  “l4  ~  ^ •  ’  Yi3  -  1  •  S  and  y-|^  -  2. 

-i  -3  -3  -3  -3  -3 

“11  “12  “13  “14  Y1 3  y14 


0.0000  0.6685  1.3371  2.0054  0.6642  0.9972 


We  conclude  the  section  on  numerical  results  by  returning  to  an  example 
similar  to  the  dispersal  model  described  in  the  introduction.  To  demonstrate 
the  effectiveness  of  our  methods  for  systems  of  this  type  we  have  chosen  model 


equations  for  which  the  diffusion  and  convection  coefficients  are  realistic 
from  the  point  of  view  of  an  intended  application  (see  the  discussions  in 
[12]).  A  report  on  our  efforts  to  use  actual  field  data  to  estimate  variable 
diffusion,  convection,  and  source/sink  parameters  in  systems  of  the  form  (1.1) 
will  appear  elsewhere.  Consider 


'  ut  +  (l/(t,x)u)x  =  (0ux)x  +  g(t,x),  0  <  x  <  1 ,  0<tj<2, 

,  u(0,x)  =  -  400x(x-l) 
u(t,0)  =  u(t,l)  =  0 


where  the  "true"  distribution  of  insects  is  given  by  u(t,x)  =  -  400x(x-l )cos j 
and  a  constant  diffusion  coefficient  is  given  by  V  =  20.  The  time  and  spatially 
varying  convection  coefficient  l/( t , x )  =  -  100(x-. 5) (4-t)  is  chosen  to  suggest 
an  attraction  of  insects  to  the  point  x  =  .5  (the  densely  vegetative  center 
of  a  linear  array  of  plants)  and  to  indicate  a  decreasing  attraction  to  the 
region  as  time  increases  (a  reasonable  model  if  the  quantity  of  foodstuffs  is 
decreasing  over  time).  As  before,  9x3  data  points  are  generated  using  u, 
and  the  source/sink  term  is  artificially  defined  to  be  g(t,x)  =  u.  -  (l/(t,x)u) 

^  X 


<«*>*  ■ 

We  remark  here  that  although  the  theory  developed  in  sections  2  and  3 
(for  equations  in  the  form  of  (2. 1 )-(2. 3))  may  be  applied  to  this  problem  with 
q,  =  V  >  q^ 


Px  -  V,  an u  q3 


l/x,  we  cannot  claim  that  the  convergence 


47 


of  q^(M)  to  q_j(M),  i  =  1,2,  guarantees  that  t/^(M)  ->  i'(M)  (where  y^(M)  =  -  q^(M)  + 

-N  - 

(q-j (M) )x  and  U(M)  =  -  q2(M)  +  (q^ (M) )x  unless  we  also  have  the  convergence  of 
-N 

(q-j(M))x  to  (q-|(M))x.  We  certainly  have  that  in  the  cases  we  consider  here. 

In  what  follows  we  let  i/(t,x)  =  h2(t)c2(x)  where  h2(t)  =  4  -  t  and 

c2(x)  =-  100(x-  .5).  True  basis  coefficients  for  h2  as  an  element  of  ( 3) 

1  -  2 

are  given  by  =  4,  a22  =  3-^  ,  a22  =  2-j  ,  and  =  2.  The  basis  coefficients 

for  Pc2  e  Sx(l)  (for  P  the  usual  L2  projection  of  l_2  onto  Sx(l))  are  used 
for  comparison  as  the  "true"  values  of  y2m,  m  =  1,...,4;  those  values  are 
given  by  y21  =  25,  y22  =  »  ^23  =  “8I’  and  ^24  =  ~  25‘  In  addition,  we 

allow  time  varying  estimates  for  V,  taking  V  €.  L^O).  "True"  basis  coefficients 
for  V  =  20  are  ct^  =  20,  k  =  1,...,4.  The  following  problems  were  studied 
and  results  obtained. 

(a)  We  estimate  V  with  \J  -  \J  given.  A  time-varying  initial  guess  of 
V  =  40- 1 5t  (start-up  values  are  =  40,  =  30,  =  20,  and 

a14  =  10)  is  used. 


19.9998 


19.9996 


19.9997 


19.9997 


(b)  We  seek  an  estimate  for  h2,  assuming  that  c2  =  c2  and  V  -  V  are 
given.  Supposing  that  a  priori  information  about  1/  indicates  that 
it  might  depend  on  the  spatial  variables  only,  we  take  an  initial 
guess  of  h2  h  1  (or  ct^  =  1 ,  k  =  1 , . . .  ,4) .  The  N  =  3  estimates  for 
a 2^  reveal  that  this  is  indeed  not  the  case. 


4.0001 


3.3335 


2.6668 


2.0001 
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(c)  We  estimate  c ^  with  and  V  =  V  fixed.  Start-up  values  are 

Y  2m  ~  ®  *  ***  “  1 » . . . » 4 . 


-3 

Y21 


-3 

y22 


-3 

y23 


25.0201 


8.3373 


-8.3390 


-3 

y24 


-24.9459 


(d)  We  attempt  to  identify  V  with  V  =  V  given.  Since  V  is  the  product 
of  h2  and  C2>  at  least  one  of  their  basis  coefficients  must  be 
fixed  or  there  will  be  an  infinite  number  of  possible  combinations 
of  1^  and  C2  that  still  yield  1^2  =  I/.  Here  we  fix  Y21  =  dncl 
y22  =  y22'  Start-up  values  for  the  remaining  coefficients  are 
-  1j  k  -  1 , . . .  ,4,  and  Y23  S ,  y  24 


-3 

“21 


-3 

“22 


-3 

a23 


-3 

“24 


-3 

y23 


-3 

Y24 


4.0005 


3.3338  2.6670 


2.0003 


-8.3333  -24.9883 


(e)  We  repeat  the  calculations  of  (d)  except  that  we  now  estimate  Y22 
as  well  (only  Y2i  =  Y2I  ^xecl  Start-up  values  are 

(*2 1  -  3.2,  ®22  _  ^ ^ *  a23  ~  ^  ®  ’  a24  _  ^  ^ »  ^22  -  ^ *  ^23  =  ”  ®  •  3  > 
and  y24  =  -  15. 

-3  -3  -3  -3  -3  -3  -3 

a21  a22  a23  a24  Y22  Y23  Y24 


§5.  Concluding  remarks 


In  the  above  presentation  we  have  given  a  convergence  theory  for  algorithms 
to  estimate  functional  coefficients  in  parabolic  systems.  While  we  treated 
only  scalar  equations,  the  ideas  generalize  immediately  to  vector  systems  of 
equations  and  hence  are  applicable  to  a  wide  variety  of  transport  problems  in 
addition  to  the  species  dispersal  problems  that  motivated  our  efforts  nere 
(for  example,  transport  of  several  labeled  substances  in  physiological  systems 
such  as  those  in  the  brain  transport  models  investigated  by  Kyner,  et.al.  -- 
see  [21],  [25],  [26],  [2],  [12],  [28]).  Furthermore,  we  believe  the  theoretical 
framework  presented  above  is  the  most  promising  of  several  possible  approaches 
in  attempting  to  develop  an  estimation  theory  for  certain  nonlinear  systems 
such  as  those  arising  in  transport  models  with  density  dependent  coefficients 
(we  are  currently  pursuing  developments  in  this  direction)  even  though  the 
framework  of  [7]  handles  some  nonlinearities  in  a  most  convenient  fashion. 

Our  basic  parameter  estimation  ideas  and  techniques  can  also  be  readily 
extended  to  treat  problems  with  several  spatial  variables.  We  have  already 
considered  theoretical  aspects  (with  a  positive  outcome)  and  are  currently 
developing  computational  packages  to  treat  several  specific  problems  including 
species  dispersal  in  two  dimensional  domains  and  estimation  in  large  space 
structures  (the  Maypole  Hoop/Column  antenna  --  sea  [2]  and  [10]). 

Finally,  while  we  have  not  here  emphasized  that  either  boundary  conditions 
or  initial  conditions  or  both  may  be  unknown  in  many  problems,  such  unknown 
parameters  can  easily  be  included  in  our  theory  and  algorithms.  Indeed, 
we  have  successfully  used  our  methods  to  estimate  boundary  and  initial  value 
parameters  for  both  parabolic  and  hyperbolic  systems  (see  the  discussions  and 
results  in  [3],  [5],  [6 j ,  [7],  [12]). 
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